Maximizing surgical success in epilepsy: seizure-onset zone resection is crucial, while extending beyond offers no added benefit (supplementary file 2)
Chifaou Abdallah1,2,†,*, Zhengchen Cai1,†, Saba Rammal1, Olivier Aron3,4, Nicolás Von Ellenrieder1, Gang Chen5, Sana Hannan6, John Thomas7,8, Philippe Kahane9, Lorella Minotti9, Stephan Chabardes9, Sophie Colnat-Coulbois3,4, Louis Maillard3,4, Jeff Hall1, Francois Dubeau1, Jean Gotman1, Christophe Grova2,10,11,‡, Birgit Frauscher1,7,8,*,‡
1. Montreal Neurological Institute and Hospital, McGill University, Montréal, Québec H3A 2B4, Canada.
This supplementary material summarizes the analysis conducted for the manuscript “Maximizing surgical success in epilepsy: seizure-onset zone resection is crucial, while extending beyond offers no added benefit.”
Instructions: Each results section contains the corresponding computation code chunk. All code chunks and printed messages are folded by default. They can be unfolded to see the details. This provides a more coherent way of presenting results and code together. Highlighted text is in orange, and hyperlinks to references and sections are in light blue.
Load data and prepare variable values and levels. Only Engel I is considered a good outcome; all others are considered poor outcomes. All categorical variables are encoded as factors. Note that, for convenience in setting priors and for numerical stability during model fitting, the continuous variables are standardized.
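This preparation step can be sketched in base R as follows; the data frame `df` and the column `engel` are illustrative placeholders, while the standardization constants (`soz_vol_mean`, `soz_vol_sd`) are the ones reused later to back-transform effects to the original scale.

```r
# Binary outcome: Engel I = good, all other classes = poor (placeholder names)
df$outcome_simplified <- factor(ifelse(df$engel == "I", "good", "poor"),
                                levels = c("poor", "good"))

# Encode categorical predictors as factors
df$focality <- factor(df$focality)

# Standardize continuous predictors; keep mean/SD for back-transformation
soz_vol_mean <- mean(df$soz_vol, na.rm = TRUE)
soz_vol_sd   <- sd(df$soz_vol, na.rm = TRUE)
df$soz_vol   <- (df$soz_vol - soz_vol_mean) / soz_vol_sd
```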
Conventional statistical tests on categorical variables are conducted to examine whether the count distributions of two categorical variables are independent of each other. We report the results of Pearson’s chi-squared test (in the subtitle). To interpret these results: for instance, Pearson’s χ²-test of independence reveals that, across 200 patients, there is no statistically significant association between focality and surgical outcome. In fact, none of these categorical variables shows a statistically significant association with surgical outcome, except for phase I-II.
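The independence test can be sketched with base R’s `chisq.test()`; the 2 × 2 counts below are invented purely for illustration.

```r
# Hypothetical contingency table of focality vs. surgical outcome
tab <- matrix(c(45, 55,
                38, 62),
              nrow = 2, byrow = TRUE,
              dimnames = list(focality = c("focal", "non-focal"),
                              outcome  = c("good", "poor")))

# Pearson's chi-squared test of independence; a large p-value indicates
# no evidence of association between the two variables
chisq.test(tab)
```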
1.2 Univariate analysis for continuous variables
Non-parametric between-group tests are used to examine whether continuous variables differ between surgical outcomes. The SOZ-resected fraction shows a considerable difference between good and poor outcomes, with a median difference of approximately 16%. In contrast, the differences in SOZ volume, cavity volume, and non-SOZ portion of the cavity between good and poor outcomes are subtle.
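Such a comparison can be sketched with the Wilcoxon rank-sum test in base R; `df` and its column names are placeholders for the analysis data.

```r
# Non-parametric between-group test on a continuous variable
wilcox.test(soz_resected_fraction ~ outcome_simplified, data = df)

# Median difference between good and poor outcomes
with(df,
     median(soz_resected_fraction[outcome_simplified == "good"]) -
     median(soz_resected_fraction[outcome_simplified == "poor"]))
```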
1.3 Multivariate analysis with all variables
Logistic regression is conducted using all variables, and an ANOVA (analysis of deviance) is performed to test the significance of each variable in the logistic regression. Only the SOZ-resected fraction and phase I-II are statistically significant, indicating that these two variables have a significant positive association with the outcome.
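A sketch of this conventional analysis is shown below; the variable names are placeholders, and note that `anova()` on a `glm` fit gives a sequential analysis of deviance.

```r
# Logistic regression with all candidate predictors (placeholder names)
fit <- glm(outcome_simplified ~ soz_resected_fraction + nonsoz_resected_fraction +
             soz_vol + cavity_vol + focality + phase_I_II,
           family = binomial(link = "logit"), data = df)

# Likelihood-ratio (analysis of deviance) test for each term, added sequentially
anova(fit, test = "Chisq")
```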
In fact, the non-SOZ-resected fraction can be included along with the SOZ and cavity volumes in the model, considering that: 1) it is convenient, as the effects of the SOZ-resected fraction and the non-SOZ portion of the cavity can be estimated from the same model; and 2) it helps increase the precision of the effect estimates. If we denote SOZ-resected as “X”, Outcome as “Y”, combine SOZVolume and CavityVolume into one node “Z”, and denote unmeasured nodes (e.g., Surgery) as “U”, this is equivalent to model 2 in Cinelli, Forney, and Pearl (2020), suggesting that Z is a good control. Likewise, if we denote SOZ-resected as “X” and Non-SOZ-resected as “Z”, this is equivalent to model 8 in Cinelli, Forney, and Pearl (2020), suggesting that Z is a good control, potentially improving precision. The following DAG query also shows that the non-SOZ portion of the cavity, the SOZ volume, and the cavity volume are valid adjustment variables for the effect of the SOZ-resected fraction.
DAG query: SOZResected → Outcome, direct effect
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd
------------------------------------------------------
SOZ-resected fraction | 3.71 | [1.22, 6.11] | 99.70%
2.2 Non-SOZ-resected fraction
The effect of the non-SOZ-resected fraction on outcome is subtle, suggesting that a higher or lower non-SOZ portion of the cavity may not increase or decrease the probability of a good surgical outcome. However, it would be risky to state this interpretation definitively, since it relies solely on the median effect value; the uncertainty of the estimated effect is large, as reflected by the wide 95% credible interval. To be precise, the overall impact of the non-SOZ portion is generally subtle, but there is notable variation among individual patients.
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd
-----------------------------------------------------------
Non-SOZ-resected fraction | 0.75 | [-2.87, 4.36] | 65.54%
2.3 SOZ volume
The total effect of eloquent, epilepsy type, focality, phase I-II, LVFA and SOZ volume on the outcome can be estimated using one model shown in the DAG queries below. All these variables happen to be the adjustment variables for each particular effect of interest. While the DAG identifies necessary adjustment variables, the specific model structure also depends on model complexity and available data sample size. The rationale behind our model selection for the main result section is explained in detail in section 3.
DAG query: SOZVolume → Outcome, total effect
mfx <- mod4rand3fix %>%
  avg_slopes(variables = "soz_vol")
mfx <- posterior_draws(mfx) %>%
  mutate(value = draw / soz_vol_sd * 100) %>%
  select(value)

pp_table1 <- bayestestR::describe_posterior(
  mfx %>% rename("SOZ volume" = value),
  ci_method = "hdi", ci = 0.95,
  test = c("p_direction"), null = 0)

p_tempa <- marginale_continuous_plot(
  mfx, 0, "\u2206Pr(good outcome) in %",
  "per unit \u2206 in SOZ volume",
  "a | average marginal effect")

mfx <- slopes(mod4rand3fix, variables = 'soz_vol') %>%
  posterior_draws() %>%
  mutate(draw = draw / soz_vol_sd * 100,
         estimate = estimate / soz_vol_sd * 100) %>%
  mutate(soz_vol = soz_vol * soz_vol_sd + soz_vol_mean)

p_tempb <- marginale_slop_plot(
  mfx, "soz_vol", parse(text = "SOZ~volume~(cm^3)"),
  "\u2206Pr(good outcome) in %",
  "b | marginal effect changes across the original data")

mfx <- predictions(mod4rand3fix, by = 'soz_vol') %>%
  posterior_draws() %>%
  mutate(soz_vol = soz_vol * soz_vol_sd + soz_vol_mean)

p_tempc <- marginale_prediction_plot(
  mfx, "soz_vol", parse(text = "SOZ~volume~(cm^3)"),
  "Pr(good outcome)",
  "c | posterior predictive simulation with the original data")

mapplot1 <- ggarrange(p_tempa$p_temp, p_tempb$p_temp,
                      nrow = 1, ncol = 2, widths = c(0.6, 1))
mapplot <- ggarrange(mapplot1, p_tempc$p_temp, nrow = 2, ncol = 1,
                     widths = c(1, 1), heights = c(1, 1.1))

p_temp_file <- tempfile(tmpdir = "./tmp", fileext = '.png')
agg_png(p_temp_file, width = 30, height = 22.5, units = "in", res = figdpi)
suppressWarnings(print(mapplot))
invisible(dev.off())
knitr::include_graphics(p_temp_file)
Summary of the distribution in panel a
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd
---------------------------------------------
SOZ volume | -0.78 | [-1.58, 0.00] | 97.32%
2.4 Cavity volume
With the current DAG, it is not possible to estimate the total effect of cavity size through adjustments. This is because there is no way to account for the backdoor path CavityVolume ← Surgery → MissedSOZ → Outcome when all the nodes in the path are unmeasured. This limitation is reasonable since we lack information about other factors related to the surgery that could potentially alter the cavity size and also influence the outcome. Without considering these additional sources, it is not possible to eliminate the potential confounding along this causal path.
DAG query: CavityVolume → Outcome, total effect
print_adjustment_sets(adjustmentSets(dag_full, "CavityVolume", "Outcome", effect = "total"))

[1] "No Way to Block Backdoor Paths"
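The mechanism behind this result can be reproduced with a minimal dagitty sketch; the graph below is a simplified stand-in for the full DAG (`dag_full`) used in the analysis.

```r
library(dagitty)

# Simplified stand-in: Surgery and MissedSOZ are unmeasured (latent)
dag <- dagitty('dag {
  Surgery   [latent]
  MissedSOZ [latent]
  Surgery -> CavityVolume
  Surgery -> MissedSOZ
  MissedSOZ -> Outcome
  CavityVolume -> Outcome
}')

# The backdoor path CavityVolume <- Surgery -> MissedSOZ -> Outcome runs
# entirely through latent nodes, so no valid adjustment set exists
adjustmentSets(dag, "CavityVolume", "Outcome", effect = "total")
```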
2.5 Categorical variables
As shown in the DAG query results, the parameters in the above SOZ volume model can also be used to interpret the effects of epilepsy type, focality, eloquent, phase I-II, and LVFA.
No adjustment variable is required for the total effect of laterality on outcome.
DAG query: Laterality → Outcome, total effect
mod_fit_overall <- brm(
  data = df_norm, family = bernoulli(link = "logit"),
  outcome_simplified ~ 1 + laterality,
  prior = c(prior(normal(0, 1.5), class = Intercept)),
  control = list(adapt_delta = 0.99),
  iter = 3500, warmup = 1000, chains = 4, cores = 4, seed = 19850913)
mod_fit_overall
Model fitting results and diagnostics
Family: bernoulli
Links: mu = logit
Formula: outcome_simplified ~ 1 + laterality
Data: df_norm (Number of observations: 200)
Draws: 4 chains, each with iter = 3500; warmup = 1000; thin = 1;
total post-warmup draws = 10000
Regression Coefficients:
                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept                0.08      0.23    -0.37     0.53 1.00     5281     4579
lateralityunilateral     0.27      0.29    -0.31     0.86 1.00     5463     4420
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
mod_fit_overall <- brm(
  data = df_fit, family = bernoulli(link = "logit"),
  outcome_simplified ~ 1 + (1 | pathology),
  prior = c(prior(normal(0, 1.5), class = Intercept),
            prior(normal(0, 0.3), class = sd)),
  control = list(adapt_delta = 0.99),
  iter = 3500, warmup = 1000, chains = 4, cores = 4, seed = 19850913)
mod_fit_overall
Model fitting results and diagnostics
Family: bernoulli
Links: mu = logit
Formula: outcome_simplified ~ 1 + (1 | pathology)
Data: df_fit (Number of observations: 163)
Draws: 4 chains, each with iter = 3500; warmup = 1000; thin = 1;
total post-warmup draws = 10000
Multilevel Hyperparameters:
~pathology (Number of levels: 5)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.26 0.16 0.01 0.63 1.00 3588 3331
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.28 0.22 -0.15 0.70 1.00 5459 5458
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd
------------------------------------------
no | 0.53 | [0.42, 0.64] | 78.08%
yes | 0.59 | [0.51, 0.66] | 65.86%
overall | 0.57 | [0.50, 0.64] | 50.00%
3 Part III: Prior and model selections
As mentioned previously, for a categorical variable we model its levels through the intercept coefficients of the logistic regression. The model fitting results and diagnostics (Gabry et al. 2019) are also shown below, with the estimated parameter distribution of every coefficient (e.g., posterior mean and its standard error). We use epilepsy type as an example; the model code is shown below.
Model code
mod_fit_overall <- brm(
  data = df_norm, family = bernoulli(link = "logit"),
  outcome_simplified ~ 1 + epilepsy_type,
  prior = c(prior(normal(0, 1.5), class = b)),
  control = list(adapt_delta = 0.99),
  iter = 3500, warmup = 1000, chains = 4, cores = 4, seed = 19850913)
Note that we chose normal(0, 1.5) as the prior for the intercept coefficients (unfold the above code to see it). Isn’t it too narrow? In other words, one might argue that this prior could “bias” the model because it is subjectively narrow. Let us explore this prior to see whether it is “subjective”. We start with a flat prior such as normal(0, 500), which is very wide and could be considered “objective”, since it says we know almost nothing about this parameter: it could be any value from this wide distribution, with a mean of 0. We then use this prior to see what the implied probability of achieving a good surgical outcome would be. This technique in the Bayesian workflow is called a prior predictive simulation/check (Gelman et al. 2020), which is used to validate the prior choice through simulation, without any real data. Additionally, this highlights another advantage of the Bayesian approach over conventional methods: Bayesian inference encourages us to explicitly state, consider, and describe our assumptions, which are then encapsulated within the model equations and priors.
To conduct the prior predictive simulation, we calculate the output of the model (good or poor surgical outcome) for one value of the intercept randomly drawn from the “objective” prior, and then repeat this process many times (i.e., 16,000) to obtain the distribution of the probability of a good surgical outcome across all outputs of the logistic regression equation. Since we use no data and the prior is “objective”, we should expect the distribution of the probability of a good surgical outcome to be centered around 0.5, with no obvious preference for any probability value. Surprisingly, as shown below, this “objective” prior prefers a good outcome probability close to either 0 or 1, and the probability mass near 0.5 is very low. This so-called “objective” prior is not “objective” at all.
How about the prior we selected in the model, normal(0, 1.5)? Does it produce fair predictions without any data? As shown below, this prior provides a much more reasonable outcome probability distribution than the imagined “objective” prior above: it does not prefer any particular probability value between 0 and 1. Keep in mind that this is without any real data; a fair model should have no preference before seeing the data if objectivity is the major consideration. When fitting the model with data, we are actually balancing the prior and the data; if the prior is bad and the sample size is small (less informative), wrong inferences can be made. This is one reason why Bayesian methods are effective for handling small sample sizes: we validate the prior rather than assuming it is inherently fair. Note that this does not mean that conventional logistic regression is wrong; it is correct when we have infinite (or asymptotically infinite) data, since the prior no longer matters in that situation. This normal(0, 1.5) prior is referred to as a weakly informative prior, as it is neither flat nor informative (e.g., a prior set close to the expected value).
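The prior predictive check described above can be reproduced with a few lines of base R, without fitting any model: draw intercepts from each candidate prior and map them to outcome probabilities through the inverse-logit.

```r
set.seed(19850913)  # same seed as used for the brms fits
n_draws <- 16000

# "Objective" flat prior: almost all implied probabilities land near 0 or 1
p_flat <- plogis(rnorm(n_draws, mean = 0, sd = 500))

# Weakly informative prior: mass spread across the whole (0, 1) range
p_weak <- plogis(rnorm(n_draws, mean = 0, sd = 1.5))

# Compare the implied outcome-probability distributions
hist(p_flat, breaks = 50, main = "Prior predictive: normal(0, 500)")
hist(p_weak, breaks = 50, main = "Prior predictive: normal(0, 1.5)")
```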
3.1 The SOZ-resected fraction model
Prior predictive simulation code
mod_fit_causal_flatprior <- brm(
  data = df_norm, family = bernoulli(link = "logit"),
  outcome_simplified ~ 1 + cavity_vol + soz_vol + overlap_ratio,
  prior = c(prior(normal(0, 500), class = Intercept),
            prior(normal(0, 500), class = b)),
  iter = 5000, warmup = 1000, chains = 4, cores = 4, seed = 19850913,
  sample_prior = "only")
Our prior, chosen based on prior predictive checks - normal(0, 0.3) - provides a good-outcome probability distribution that is centered around 0.5 and gradually decays toward extremely high or low probability values.
Prior predictive simulation code
mod_fit_causal_prior <- brm(
  data = df_norm, family = bernoulli(link = "logit"),
  outcome_simplified ~ 1 + cavity_vol + soz_vol + overlap_ratio,
  prior = c(prior(normal(0, 1.5), class = Intercept),
            prior(normal(0, 0.5), class = b)),
  iter = 5000, warmup = 1000, chains = 4, cores = 4, seed = 19850913,
  sample_prior = "only")
3.2 The SOZ volume model
In the SOZ volume model, the flat priors decay toward extreme values (high or low) faster than the weakly informative priors used in our analysis.
There were many potential approaches to building a model involving SOZ volume, epilepsy type, focality, eloquent, phase I-II, LVFA, and pathology. We could have included complex interactions between categorical variables, but chose to prioritize reliable estimation given our data size. Therefore, our main-results model focuses on interactions between SOZ volume and the other factors. Additionally, we apply a hierarchical structure only to factors with more than two levels (epilepsy type, focality, phase I-II, and pathology), as two levels are insufficient for hierarchical modeling. These decisions reflect a balance between model complexity and accuracy given the available data.
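A hypothetical sketch of how such a structure looks in brms formula syntax; the term names (`eloquent`, `lvfa`, `phase`, etc.) are illustrative placeholders, and the exact formula of the main-results model is defined elsewhere in the analysis code.

```r
library(brms)

# Illustrative only: random slopes of SOZ volume (partial pooling) for the
# factors with more than two levels, plain fixed effects for binary factors
f <- bf(outcome_simplified ~ 1 + soz_vol +
          eloquent + lvfa + soz_vol:eloquent + soz_vol:lvfa +
          (1 + soz_vol | epilepsy_type) +
          (1 + soz_vol | focality) +
          (1 + soz_vol | phase) +
          (1 + soz_vol | pathology))
```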
We rigorously justify the model selection by building six alternative models with varying interactions (none, only the 7-way, or all possible 1- to 7-way interactions) and hierarchical structures (complete pooling or partial pooling). To assess generalizability, we employ Bayesian leave-one-out cross-validation (LOO-CV; see Vehtari, Gelman, and Gabry 2017) with Pareto-smoothed importance sampling (PSIS). The PSIS diagnostic also provides an assessment of model misspecification and erroneous data points (Gabry et al. 2019). Following the criteria suggested in Sivula et al. (2020), we compare models based on their expected log pointwise predictive density (ELPD). Differences in ELPD of less than 4 indicate that the models do not differ statistically in terms of generalizability. The more complicated models may have misspecification issues, as indicated by the PSIS diagnostic (e.g., warning messages showing pareto_k > 0.7). Bayes R² is also used to assess how well each model fits the data.
Warning: Found 1 observations with a pareto_k > 0.7 in model
'mod7wayinteractions'. We recommend to set 'moment_match = TRUE' in order to
perform moment matching for problematic observations.
Warning: Found 1 observations with a pareto_k > 0.7 in model
'mod7factor7wayinteractions'. We recommend to set 'moment_match = TRUE' in
order to perform moment matching for problematic observations.
Warning: Found 49 observations with a pareto_k > 0.7 in model 'modmonster'. We
recommend to set 'moment_match = TRUE' in order to perform moment matching for
problematic observations.
According to the warning messages above, some more complicated models exhibit pathological diagnostics, indicating either model misspecification or erroneous data points. Therefore, these models are excluded from the model selection.
As shown below, no model stands out statistically in terms of predictability/generalizability (no difference larger than 4). Note that the table below uses the first model as the reference and computes the difference in expected log pointwise predictive density for the other models. All models are ranked by predictability from best to worst.
As no model shows significantly better generalizability, we opt for the one with the highest Bayes R² value, the mod4rand3fix model, indicating the best fit to the data (see the table below). Its hierarchical structure is also a more reasonable choice, given the sample size at each level and the total number of factors in the model.
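The comparison workflow can be sketched as follows; the model objects are those fit earlier in the analysis and are shown here schematically.

```r
library(brms)

# PSIS-LOO for each candidate model; the pareto_k diagnostics flag possible
# misspecification or influential observations
loo_a <- loo(mod4rand3fix)
loo_b <- loo(mod7wayinteractions)

# Differences in expected log pointwise predictive density (elpd_diff);
# |elpd_diff| < 4 means no model clearly generalizes better
loo_compare(loo_a, loo_b)

# Bayes R^2 as the tie-breaker for in-sample fit
bayes_R2(mod4rand3fix)
```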
Cinelli, Carlos, Andrew Forney, and Judea Pearl. 2020. “A Crash Course in Good and Bad Controls.” SSRN Electronic Journal, 1–27. https://doi.org/10.2139/ssrn.3689437.
Gabry, Jonah, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman. 2019. “Visualization in Bayesian Workflow.” Journal of the Royal Statistical Society. Series A: Statistics in Society 182 (2): 389–402. https://doi.org/10.1111/rssa.12378.
Gelman, Andrew, Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul Christian Bürkner, and Martin Modrák. 2020. “Bayesian Workflow.” arXiv.
Vehtari, Aki, Andrew Gelman, and Jonah Gabry. 2017. “Practical Bayesian Model Evaluation Using Leave-One-Out Cross-Validation and WAIC.” Statistics and Computing 27 (5): 1413–32. https://doi.org/10.1007/s11222-016-9696-4.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Code created by Zhengchen Cai, Email: zhengchen.cai@mcgill.ca